Pollinator Indicators

Authors
Affiliation

Kwaku Peprah Adjei

Norwegian institute for Nature Research

Anders Kolstad

Norwegian institute for Nature Research

Markus A.K. Sydenham

Norwegian institute for Nature Research

Published

August 19, 2024

Status

This indicator documentation is incomplete and needs further developement before indicator values can be calculated.

Recomended citation

Adjei, K. P, Doe, J. , Doe, J. My Great Indicator My Great Indicator v. 000.001 ecRxiv https://github.com/NINAnor/ecRxiv

Version

000.001

Table 1: Indicator metadata
Indicator ID NA
Indicator Name My Great Indicator
Country NA
Continent NA
Ecosystem Condition Typology Class NA
Realm NA
Biome NA
Ecosystem NA
Year added 1901
Last update NA
status incomplete
Version 000.001
Version comment NA
url https://github.com/NINAnor/ecRxiv

Pollinator indicators





1. Introduction

We aim to model the diversity of pollinators in Norway given their know interaction with some plant species. From the estimated diversity across the nation, we intend to look at the estimates in the ASO Data meadows by using their values as the reference values. To achieve this, we do the following:

  • obtain data from GBIF for bumblebees, butterflies and bees

  • fit an integrated species distribution model (ISDM) to the data to obtain pollinator intensities

  • obtain a web plot of the pollinator and plant species

  • get data on the interaction plant species from GBIF

  • fit an ISDM to the plant species data to obtain the species occurrence probability

  • estimate the alpha diversity index of the pollinators across Norway

2. About the underlying data

The pollinator and plant species data used to fit the ISDMs were obtained from GBIF. Cite the DOI of the downloaded data.

2.1 Spatial and temporal resolution

Both pollinator and plant data obtained was for the entire Norway observed within \(2005\) to \(2024\). For each dataset within the downloaded GBIF data, we formatted them based on their meta data into presence-only and presence-absence data. Then, we merge all the presence-absence datasets into one dataset: mergedDatasetPA and all the presence-only datasets into one dataset: mergedDatasetPO. The number of occurrences in each dataset for the pollinators are presented in Table 2.

Table 2: Number of occurrence records for each pollinator in the merged presence-absence and presence-only datasets
Dataset Pollinator No. of observations
Presence-only Bees 84485
Butterflies 474857
Hoverflies 79683
TOTAL 639025
Presence-absebce Bees 40429
Butterflies 39321
Hoverflies 51861
TOTAL 131611
Figure 1: Distribution of 54 plant species in the merged presence-absence dataset.
Figure 2: Distribution of 54 plant species in the merged presence-only dataset.
Table 3: Number of plant species occurrences from the merged presence-absence and presence-only datasets obtained from GBIF.
Plant species Presence-absence Presence-only
Ajuga pyramidalis 72650 4427
Astragalus alpinus 155978 2283
Campanula rotundifolia 218345 13558
Carum carvi 47681 2962
Filipendula ulmaria 163248 18103
Galium album 65106 3734
Galium aparine 45202 1926
Galium boreale 67332 8338
Galium elongatum 43791 1503
Galium palustre 84951 2598
Galium saxatile 5048 1619
Galium uliginosum 49728 2061
Galium verum 23600 3619
Geranium robertianum 49141 6148
Geranium sylvaticum 238108 15692
Gymnadenia conopsea 458 3659
Hieracium murorum 764 30
Hieracium umbellatum 9808 3985
Hieracium vulgatum 13483 39
Hypochaeris radicata 1988 1399
Knautia arvensis 50898 6609
Lathyrus linifolius 91357 6438
Lathyrus pratensis 77899 5923
Lathyrus vernus 56287 2228
Leucanthemum vulgare 125584 9992
Lotus corniculatus 64438 12485
Nardus stricta 47131 7179
Origanum vulgare 767 2484
Plantago lanceolata 8672 5638
Plantago major 6730 7200
Plantago media 1389 2442
Silene dioica 56046 7645
Silene vulgaris 33839 3742
Stellaria graminea 180580 8687
Stellaria media 47587 4101
Stellaria nemorum 62024 4250
Trifolium medium 35383 4670
Trifolium pratense 147575 12698
Trifolium repens 97671 11745
Trollius europaeus 44249 5928
Valeriana sambucifolia 68074 108
Veronica alpina 17738 626
Veronica chamaedrys 175299 9228
Veronica officinalis 223994 11936
Veronica serpyllifolia 5988 3153
Vicia cracca 83519 10301
Vicia sepium 112154 6270
Vicia sylvatica 45135 2214
Viola biflora 3518 3612
Viola canina 123557 3602
Viola epipsila 611 499
Viola palustris 83363 8928
Viola riviniana 425049 9602
Viola tricolor 88517 4075

Iintend to keep either the table or the figures in the final document, depending on what is needed for the report.

2.2 Original units

2.3 Additional comments about the dataset

3. Indicator properties

3.1. ECT

3.2. Ecosystem condition characteristic

3.3. Other standards

3.4. Collinearities with other indicators

4. Reference condition and values

4. 1. Reference condition

4. 2. Reference values

4.2.1 Minimum and maximum values

Reference values for diversity indices at the ASO data meadows.

ggplot(asoDataMeadows) +
  gg(asoDataMeadows["overallDiversity"], aes(col = overallDiversity)) +
scale_fill_gradientn(colours = terrain.colors(30))#+

  #geom_sf(data = regionGeometry, alpha = 0.1)

4.2.2. Threshold value for defining good ecological condition (if relevant)

4.2.3. Spatial resolution and validity

5. Uncertainties

6. References

7. Datasets

7.1 Dataset A

7.2. Dataset B

7.3. Covariates

Write a summary of the covariate names and how they were obtained in a table. Have an Yes / No indicating whether they were used to fit pollinator or plant ISDM.

The environmental covariates used for the model is presented in Figure 3.

Figure 3: Covariates used to fit the ISDM.

8. Spatial units

9. Analyses

We fitted an ISDM using the point process framework [REF] to the merged pollinator presence-only (PO) and presence-absence (PA) datasets. ISDMs are various models that are linked together by a shared ecological process.

Using the poisson process framework, we model the presence-only data as a Poisson point process model [REF] with mean intensity \(\lambda_i(s)\) for each pollinator \(i \in \{bees, butterflies, hoverflies\}\). This mean intensity is modelled as: \[\begin{equation} \label{eq:int} \ln(\lambda_i(s)) = \beta_{0i} + \beta_{ji} X_j(s) + \omega_i(s) + \eta(s), \end{equation}\] where \(\beta_{0i}\) is the pollinator-specific intercept, \(\beta_{ji}\) is the pollinator-specific effect of covariate \(j\), \(X_j(s)\) is the \(j^{th}\) covariate field, \(\omega_i(s)\) is the pollinator-specific spatial autocorrelation field and \(\eta(s)\) is the bias field. \(\omega_i(s)\) and \(\eta(s)\) are modeled as zero mean Gaussian random field (i.e. \(\omega_i(s) \sim N(0, \Sigma)\), where \(\Sigma\) is a Mat’ern covariance matrix with variance \(\sigma_{\omega}^2\) and range \(\kappa\) and \(\eta(s) \sim N(0, \Sigma_\eta)\), where \(\Sigma_\eta\) is a Mat’ern covariance matrix with variance \(\sigma_{\eta}^2\) and range \(\kappa_\eta\)).

We model the PA with a logistic regression model. Let \(y_i(s)\) be a binary value at location \(s\) with \(0\) indicating absence of pollinator \(i\) and \(1\) indicating presence of pollinator \(i\). Then \(y_i(s) \sim \text{Bernoulli}(\Psi_i(s))\) with: \[ cloglog(\Psi_i(s)) = \beta_{0i} + \beta_{ji} X_j(s) + \omega_i(s) \] where the parameters are defined in equation \(\ref{eq:int}\).

All the ISDMs were fitted using the R-package PointedSDMs [REF].

9.1. ISDM for pollinators

Using the model framework defined above, we fitted the ISDM to the pollinator dataset obtained from GBIF.

model <- PointedSDMs::startSpecies(PointsBeesAndButterfliesAndHoverflies, # list of pollinator dataset (containing both mergedDatasetPA and mergedDatasetPO) 
                                   Boundary = regionGeometry, # boundary of the study
                                   Projection = newCrs, # projection
                                   Mesh = meshForProject, #mesh used for the model
                                  speciesEnvironment = TRUE, # Should we have pollinator specific covariate effect
                                  speciesIntercept = TRUE, # TRUE treats the intercept as a random effect, instead of constrained as with a fixed effect 
                                  pointsIntercept = FALSE, #should we include intercept for each dataset
                                  pointCovariates = FALSE, #do we have covariates for the presence-only data
                                   responsePA = 'individualCount', # column name of the response values for presence-absence data
                                  # trialsPA = 'trials',
                                   spatialCovariates = envCovsForModel, # raster of spatia covariates
                                   speciesName = 'Taxon', # Names of the species in the datasets
                                   pointsSpatial = NULL, # Do not include a dataset spatial field
                                   speciesSpatial = "replicate")   # unique spatial field per species, but they share the same hyperparameters.

# Add second bias field for PO data
model$addBias(datasetNames = 'mergedDatasetPO')

Priors

We assume the following priors for the pollinator ISDM:

  • The probability that the standard deviation of the pollinator spatial field is greater than \(0.1\) is \(0.1\) (i.e. \(P(\sigma_\omega > 0.1) = 0.1\)).

  • The probability that the standard deviation of the bias field is greater than \(0.1\) is \(0.1\) (i.e. \(P(\sigma_\eta > 0.1) = 0.1\))

  • The probability that the spatial range of the pollinator spatial field is greater than \(50\) is \(0.1\) (i.e \(P(\kappa_\omega < 50) = 0.1\))

  • The probability that the spatial range of the pollinator spatial field is greater than \(50\) is \(0.1\) (i.e. \(P(\kappa_\eta < 50) = 0.1\))

# hyper parameters of the spatial field (shared across species)
model$specifySpatial(Species = TRUE,  # define same prior for the all species
                     prior.sigma = c(0.6, 0.1),  # SD of field variation; 
                     prior.range = c(5, 0.1))  # range of spatial correlation; 


model$specifySpatial(Bias = TRUE, # Change the prior
                                 prior.sigma = c(1, 0.1),
                                 prior.range = c(5, 0.1))

model$specifyRandom(
  # precision parameter on how much each species' spatial field (how much they can deviate from the shared ___)
  speciesGroup = list(model = "iid", 
                      hyper = list(prec = list(prior = "pc.prec",
                                               param = c(0.1, 0.1)))), 
  # precision parameter on the baseline species occurrence rate
  speciesIntercepts = list(prior = 'pc.prec',
                           param = c(0.1, 0.1)))  

Fit model and make predictions

We fit the model and make predictions of the pollinator intensity within the study region.

modelRun <- PointedSDMs::fitISDM(data = model, 
                            options = modelOptions)
  
  # predictions for this model
  individualDatasetPreds <- predict(modelRun,
          data = ppxl,
          predictor = TRUE,
          n.samples = 100)

9.2. ISDM for plant species

Modelling all the \(54\) species required memoey allocations we did not have, so we split the species into groups with \(10\) species in each group. We fitted independent ISDMs for each of the groups.

# PointedSDMs takes a longer time to fit for many species
# The trick to to break it into smaller groups
# The split will be done by the number of present speccies in each location
# I will split it nGroups time
sortedSpecies <- table(asoDatasf$acceptedScientificName)%>%
  as.data.frame()%>%
  dplyr::arrange(desc(Freq))%>%
  select(Var1)%>%
  c()

# Set the number of groups
nGroups <- 10

#split the species into groups of 10 species in each
plantSpeciesGroup <- split(sortedSpecies$Var1, 
                           seq(1, 
                               ceiling(length(sortedSpecies$Var1)/nGroups)))

Similar to the model the pollinator ISDM, we fitted an ISDM to the \(54\) plant species. The ISDM has species-specific intercept, covariate effect and spatial field (but all the species share the same hyperparameters). We also added a spatial bias field to the observation model for the presence-only dataset.

speciesModelShared <- PointedSDMs::startSpecies(formatPlantData,
                                                Boundary = regionGeometry, 
                                   Projection = newCrs, 
                                   Mesh = meshForProject,
                                   responsePA = 'individualCount',
                                   speciesEnvironment = TRUE,
                                   speciesIntercept = TRUE,
                                   spatialCovariates = envCovsForPlantSpeciesModel, 
                                   speciesName = 'simpleScientificName',
                                   pointsIntercept = FALSE,
                                   pointsSpatial = NULL, # Do not include a dataset spatial field
                                   speciesSpatial = "replicate")  

# Add bias spatial field for the PO dataset
speciesModelShared$addBias(datasetNames = 'mergedPlantsPO')

Priors We assume the following priors for the pollinator ISDM:

  • The probability that the standard deviation of the pollinator spatial field is greater than \(0.1\) is \(0.1\) (i.e. \(P(\sigma_\omega > 0.1) = 0.1\)).

  • The probability that the standard deviation of the bias field is greater than \(0.1\) is \(0.1\) (i.e. \(P(\sigma_\eta > 0.1) = 0.1\))

  • The probability that the spatial range of the pollinator spatial field is greater than \(50\) is \(0.1\) (i.e \(P(\kappa_\omega < 50) = 0.1\))

  • The probability that the spatial range of the pollinator spatial field is greater than \(50\) is \(0.1\) (i.e. \(P(\kappa_\eta < 50) = 0.1\))

# hyper parameters of the spatial field (shared across species)
speciesModelShared$specifySpatial(Species = TRUE,  # define same prior for the all species
                     prior.sigma = c(1, 0.1),  
                     prior.range = c(5, 0.1))  


speciesModelShared$specifySpatial(Bias = TRUE, # Change the prior
                     prior.sigma = c(0.6, 0.1),
                     prior.range = c(5, 0.1))

# prior for random effects (mesh nodes of spatial field and species intercepts)
speciesModelShared$specifyRandom(
  # precision parameter on how much each species' spatial field (how much they can deviate from the shared ___)
  speciesGroup = list(model = "iid", 
                      hyper = list(prec = list(prior = "pc.prec",
                                               param = c(0.1, 0.1)))),  
  # precision parameter on the baseline species occurrence rate
  speciesIntercepts = list(prior = 'pc.prec',
                           param = c(0.1, 0.1)))  

Model fitting and predictions

# specify the model options for INLA
modelOptions <- list(control.inla = list(int.strategy = 'eb', 
                                         diagonal = 0.1, 
                                         cmin = 0), 
                     verbose = FALSE, 
                     safe = TRUE)

# Species-specific spatial effects model
speciesSharedEst <- PointedSDMs::fitISDM(data = speciesModelShared, 
                            options = modelOptions)

# I proceed with the prediction of occupancy probabilities
# over the entire region
individualDatasetPreds <- predict(speciesSharedEst,
                                  data = ppxl,
                                  predictor = TRUE,
                                  n.samples = 10)

9.3 Species Interractions

  load("../data/webPlot.RData")
  #interractionMatrix <- readr::read_csv(paste0(dataFolder,"/interractionsData/interractionMatrix.csv"))
  
    bipartite::plotweb(WebReady[[1]])

Webplot of plant species and pollinator interractions.

9.4. Diversity Indices

To define the diversity of the pollinators, we estimate the pollinator intensity given their interaction with the plants species within the location \(s\). This is estimated as: \[\begin{equation} \begin{split} \text{Intensity for intensity} &\propto \text{Estimated pollinator intensity} \times \text{Interraction weight} \times \text{plant species occurrence probability}\\ \implies \lambda^{\star}_i(s) &= \frac{\lambda_i(s) \times w_{ik} \times \Psi_k(s)}{\sum_k \lambda_i(s) \times w_k \times \Psi_k(s)}, \end{split} \end{equation}\] where \(w_k\) is the weight of the interaction between pollinator \(i\) and plant species \(k\), \(\Psi_k(s)\) is the occurrence probability of plant species \(k\) and \(\lambda_i(s)\) is the intensity of pollinator \(i\).

We estimated the Hill’s diversity indices for the pollinators. The Hill’s diversity indices are defined as: \[ H(s) = r_i(s) \times log(r_i(s)); \] where \(r_i(s) = \frac{\lambda^{\star}_i(s)}{\sum_{i} \lambda^{\star}_i(s)}\).

10. Results

Figure 4: Covariate effects of pollinators.
Figure 5: Pollinator Predictions.
Figure 6: Standard deviation of Pollinator Predictions.
Figure 7: Plant species fixed effects.
Figure 8: Plant species land cover effects.
Figure 9: Plant species Predictions.
Figure 10: Pollinator Diversity.

11. Export file